Multiscale Scattering for Audio Classification

Authors

  • Joakim Andén
  • Stéphane Mallat
Abstract

Mel-frequency cepstral coefficients (MFCCs) are efficient audio descriptors providing spectral energy measurements over short time windows of length 23 ms. These measurements, however, lose non-stationary spectral information such as transients or time-varying structures. It is shown that this information can be recovered as spectral co-occurrence coefficients. Scattering operators compute these coefficients with a cascade of wavelet filter banks and modulus rectifiers. The signal can be reconstructed from scattering coefficients by inverting these wavelet modulus operators. An application to genre classification shows that second-order co-occurrence coefficients improve on the results obtained with MFCC and Delta-MFCC descriptors.
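The cascade described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it follows a single scattering path rather than a full filter bank, it substitutes truncated Gabor filters for a proper wavelet family, and every name and parameter value below (`gabor`, `scattering_path`, the filter widths, the test signals) is an assumption chosen for illustration. It compares a steady tone with an amplitude-modulated one to show what time averaging discards at first order and what a second wavelet modulus stage picks up.

```python
import cmath
import math


def gabor(xi, sigma):
    """Gaussian-windowed complex exponential centred at xi cycles/sample.

    Truncated at 4 * sigma and scaled so the tap magnitudes sum to 1.
    With xi = 0 this is a plain Gaussian low-pass filter.
    """
    half = int(4 * sigma)
    taps = [math.exp(-t * t / (2.0 * sigma * sigma)) * cmath.exp(2j * math.pi * xi * t)
            for t in range(-half, half + 1)]
    norm = sum(abs(c) for c in taps)
    return [c / norm for c in taps]


def circ_conv(x, taps):
    """Circular convolution of x with a filter whose origin is its middle tap."""
    n, half = len(x), len(taps) // 2
    return [sum(taps[k + half] * x[(i - k) % n] for k in range(-half, half + 1))
            for i in range(n)]


def wavelet_modulus(x, xi, sigma):
    """One cascade stage: band-pass filtering followed by a modulus rectifier."""
    return [abs(v) for v in circ_conv(x, gabor(xi, sigma))]


def scattering_path(x, xi1, sigma1, xi2, sigma2, sigma_avg):
    """Time-averaged first- and second-order coefficients along one path."""
    phi = gabor(0.0, sigma_avg)  # low-pass averaging filter

    def average(u):
        return [v.real for v in circ_conv(u, phi)]

    u1 = wavelet_modulus(x, xi1, sigma1)    # |x * psi1|
    u2 = wavelet_modulus(u1, xi2, sigma2)   # ||x * psi1| * psi2|
    return average(u1), average(u2)


def mean(values):
    return sum(values) / len(values)


# Illustrative signals (assumed, not from the paper): same carrier,
# with and without a slow amplitude modulation.
N = 256
CARRIER, MODULATION = 0.25, 1.0 / 16.0   # cycles per sample
steady = [math.cos(2 * math.pi * CARRIER * t) for t in range(N)]
tremolo = [(1 + 0.8 * math.cos(2 * math.pi * MODULATION * t)) * s
           for t, s in enumerate(steady)]

params = dict(xi1=CARRIER, sigma1=2.0, xi2=MODULATION, sigma2=16.0, sigma_avg=16.0)
s1_steady, s2_steady = scattering_path(steady, **params)
s1_tremolo, s2_tremolo = scattering_path(tremolo, **params)

print("first order :", mean(s1_steady), mean(s1_tremolo))
print("second order:", mean(s2_steady), mean(s2_tremolo))
```

In this toy setting the averaged first-order coefficients of the two signals should come out nearly equal, because the low-pass filter smooths away the modulation, while the second-order coefficient should be clearly larger for the modulated tone. A real scattering transform would repeat this over a bank of wavelets at each order.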


Similar resources

A Novel Noise-Robust Texture Classification Method Using Joint Multiscale LBP

In this paper we describe a novel noise-robust texture classification method using a joint multiscale local binary pattern. The first step in texture classification is to describe the texture by extracting different features. So far, several methods have been developed for this topic; one of the most popular is the Local Binary Pattern (LBP) method and its variants, such as Completed Local Binary...


Detection of Melanoma Skin Cancer by Elastic Scattering Spectra: A Proposed Classification Method

Introduction: There is a strong need for developing clinical technologies and instruments for prompt tissue assessment in a variety of oncological applications as smart methods. Elastic scattering spectroscopy (ESS) is a real-time, noninvasive, point-measurement, optical diagnostic technique for malignancy detection through changes at cellular and subcellular levels, especially important in ear...


Pairwise Decomposition with Deep Neural Networks and Multiscale Kernel Subspace Learning for Acoustic Scene Classification

We propose a system for acoustic scene classification using pairwise decomposition with deep neural networks and dimensionality reduction by multiscale kernel subspace learning. It is our contribution to the Acoustic Scene Classification task of the IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events (DCASE2016). The system classifies 15 different acoustic scenes. ...


Listening to the World Improves Speech Command Recognition

We study transfer learning in convolutional network architectures applied to the task of recognizing audio, such as environmental sound events and speech commands. Our key finding is that not only is it possible to transfer representations from an unrelated task like environmental sound classification to a voice-focused task like speech command recognition, but also that doing so improves accur...


Multiscale Sparse Microcanonical Models

We study density estimation of stationary processes defined over an infinite grid from a single, finite realization. Gaussian Processes and Markov Random Fields avoid the curse of dimensionality by focusing on low-order and localized potentials respectively, but their application to complex datasets is limited by their inability to capture singularities and long-range interactions, and their expe...



Publication date: 2011